Can SAEs reveal and mitigate racial biases of LLMs in healthcare?
Ahsan, Hiba, Wallace, Byron C.
LLMs are increasingly being used in healthcare. This promises to free physicians from drudgery, enabling better care to be delivered at scale. But the use of LLMs in this space also brings risks; for example, such models may worsen existing biases. How can we spot when LLMs are (spuriously) relying on patient race to inform predictions? In this work we assess the degree to which Sparse Autoencoders (SAEs) can reveal (and control) associations the model has made between race and stigmatizing concepts. We first identify an SAE latent in Gemma-2 models which appears to correlate with Black individuals. We find that this latent activates on reasonable input sequences (e.g., "African American") but also on problematic words like "incarceration". We then show that we can use this latent to steer models to generate outputs about Black patients, and that doing so can induce problematic associations in model outputs. For example, activating the Black latent increases the predicted probability that a patient will become "belligerent". We evaluate the degree to which such steering via latents might be useful for mitigating bias. We find that this offers improvements in simple settings, but is less successful for more realistic and complex clinical tasks. Overall, our results suggest that SAEs may offer a useful tool in clinical applications of LLMs for identifying problematic reliance on demographics, but that mitigating bias via SAE steering appears to be of marginal utility for realistic tasks.
- North America > United States > Virginia (0.04)
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- Asia > Middle East > Jordan (0.04)
- Asia > Middle East > Israel (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
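The steering intervention described in the abstract above can be illustrated with a short sketch: a single SAE latent's decoder direction is added to the residual stream of a Gemma-2 model during generation. The layer index, latent index, steering coefficient, and the random placeholder direction below are all assumptions for illustration; in practice the direction would come from a pretrained SAE's decoder weights, and the module path may differ by transformers version.

```python
# Minimal sketch of SAE-latent steering, assuming a HuggingFace causal LM.
# The layer, latent index, coefficient, and steering direction are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "google/gemma-2-2b"   # assumed model variant
LAYER_IDX = 12                     # hypothetical layer the SAE was trained on
STEER_COEF = 8.0                   # steering strength (tuned empirically)

tok = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME, torch_dtype=torch.bfloat16)

# Unit-norm direction for the chosen latent; in practice this row comes from
# the SAE's decoder weights (e.g., a Gemma Scope SAE checkpoint).
steer_dir = torch.randn(model.config.hidden_size, dtype=torch.bfloat16)
steer_dir = steer_dir / steer_dir.norm()

def steering_hook(module, inputs, output):
    # Decoder layers return a tuple whose first element is the residual-stream
    # hidden state; add the scaled latent direction at every token position.
    hidden = output[0] if isinstance(output, tuple) else output
    hidden = hidden + STEER_COEF * steer_dir.to(hidden.device)
    return (hidden,) + output[1:] if isinstance(output, tuple) else hidden

handle = model.model.layers[LAYER_IDX].register_forward_hook(steering_hook)
prompt = "The patient presented to the emergency department with chest pain."
ids = tok(prompt, return_tensors="pt")
out = model.generate(**ids, max_new_tokens=40)
print(tok.decode(out[0], skip_special_tokens=True))
handle.remove()
```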
Quantifying Clinician Bias and its Effects on Schizophrenia Diagnosis in the Emergency Department of the Mount Sinai Health System
Valentine, Alissa A., Lepow, Lauren A., Chan, Lili, Charney, Alexander W., Landi, Isotta
In the United States, schizophrenia (SCZ) carries a race and sex disparity that may be explained by clinician bias - a belief held by a clinician about a patient that prevents impartial clinical decision making. The emergency department (ED) is marked by higher rates of stress, leading clinicians to rely more on implicit biases during decision making. In this work, we considered a large cohort of psychiatric patients in the ED from the Mount Sinai Health System (MSHS) in New York City to investigate the effects of clinician bias on SCZ diagnosis while controlling for known risk factors and patient sociodemographic information. Clinician bias was quantified as the ratio of negative to total sentences within a patient's first ED note. We used logistic regression to predict SCZ diagnosis given patient race, sex, age, history of trauma or substance use disorder, and the ratio of negative sentences. Our findings showed that an increased ratio of negative sentences is associated with higher odds of receiving a SCZ diagnosis [OR (95% CI)=1.408 (1.361-1.456)]. Identifying as male [OR (95% CI)=1.112 (1.055-1.173)] or Black [OR (95% CI)=1.081 (1.031-1.133)] increased one's odds of being diagnosed with SCZ. However, from an intersectional lens, Black female patients with high socioeconomic status (SES) have the highest odds of receiving a SCZ diagnosis [OR (95% CI)=1.629 (1.535-1.729)]. Results such as these suggest that SES does not act as a protective buffer against SCZ diagnosis in all patients, demanding more attention to the quantification of health disparities. Lastly, we demonstrated that clinician bias can be operationalized with real-world data and is related to increased odds of receiving a stigmatizing diagnosis such as SCZ.
- North America > United States > New York (0.24)
- North America > United States > Alaska (0.05)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.48)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)
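As a companion to the abstract above, here is a minimal sketch of the kind of analysis it reports: fitting a logistic regression for SCZ diagnosis and converting coefficients to odds ratios with 95% confidence intervals. The column names and the input file are hypothetical placeholders, not the study's actual variables.

```python
# Sketch of a logistic-regression analysis predicting SCZ diagnosis from
# patient covariates plus the negative-sentence ratio of the first ED note.
import numpy as np
import pandas as pd
import statsmodels.api as sm

df = pd.read_csv("ed_cohort.csv")  # hypothetical cohort extract

predictors = ["neg_sentence_ratio", "male", "black", "age",
              "trauma_history", "substance_use_disorder"]
X = sm.add_constant(df[predictors])
y = df["scz_diagnosis"]

fit = sm.Logit(y, X).fit()

# Odds ratios with 95% confidence intervals, in the form reported above.
or_table = pd.DataFrame({
    "OR": np.exp(fit.params),
    "CI_low": np.exp(fit.conf_int()[0]),
    "CI_high": np.exp(fit.conf_int()[1]),
})
print(or_table.round(3))
```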
Can large language models be privacy preserving and fair medical coders?
Dadsetan, Ali, Soleymani, Dorsa, Zeng, Xijie, Rudzicz, Frank
Protecting patient data privacy is a critical concern when deploying machine learning algorithms in healthcare. Differential privacy (DP) is a common method for preserving privacy in such settings and, in this work, we examine two key trade-offs in applying DP to the NLP task of medical coding (ICD classification). Regarding the privacy-utility trade-off, we observe a significant performance drop in the privacy-preserving models, with more than a 40% reduction in micro F1 scores on the top 50 labels in the MIMIC-III dataset. From the perspective of the privacy-fairness trade-off, we also observe an increase of over 3% in the recall gap between male and female patients in the DP models. Further understanding of these trade-offs will help address the challenges of real-world deployment.
- North America > United States > New York > New York County > New York City (0.04)
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
- North America > United States > Washington > King County > Seattle (0.04)
- (3 more...)
- Information Technology > Security & Privacy (1.00)
- Health & Medicine (1.00)
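The two trade-offs quantified in the abstract above can be sketched with a small evaluation helper: micro-F1 over the top-50 ICD labels for utility, and the male/female recall gap for fairness. The prediction matrices below are synthetic stand-ins; only the metric definitions are intended to match the abstract.

```python
# Sketch of the utility and fairness measurements for multilabel ICD coding.
import numpy as np
from sklearn.metrics import f1_score, recall_score

def utility_and_fairness(y_true, y_pred, sex):
    """y_true, y_pred: (n_patients, 50) binary label matrices; sex: array of 'M'/'F'."""
    micro_f1 = f1_score(y_true, y_pred, average="micro", zero_division=0)
    male, female = (sex == "M"), (sex == "F")
    recall_m = recall_score(y_true[male], y_pred[male], average="micro", zero_division=0)
    recall_f = recall_score(y_true[female], y_pred[female], average="micro", zero_division=0)
    return micro_f1, abs(recall_m - recall_f)

# Synthetic example: compare a non-private model against a (noisier) DP model.
rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=(1000, 50))
sex = rng.choice(["M", "F"], size=1000)
for name, noise in [("non-private", 0.05), ("DP", 0.3)]:
    flip = rng.random(y_true.shape) < noise
    y_pred = np.where(flip, 1 - y_true, y_true)
    f1, gap = utility_and_fairness(y_true, y_pred, sex)
    print(f"{name}: micro-F1={f1:.3f}, sex recall gap={gap:.3f}")
```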
An AI-Guided Data Centric Strategy to Detect and Mitigate Biases in Healthcare Datasets
Gulamali, Faris F., Sawant, Ashwin S., Liharska, Lora, Horowitz, Carol R., Chan, Lili, Kovatch, Patricia H., Hofer, Ira, Singh, Karandeep, Richardson, Lynne D., Mensah, Emmanuel, Charney, Alexander W., Reich, David L., Hu, Jianying, Nadkarni, Girish N.
The adoption of diagnostic and prognostic algorithms in healthcare has led to concerns about the perpetuation of bias against disadvantaged groups of individuals. Deep learning methods to detect and mitigate bias have revolved around modifying models, optimization strategies, and threshold calibration with varying levels of success. Here, we develop a data-centric, model-agnostic, task-agnostic approach (AEquity) to evaluate dataset bias by investigating how easily different groups are learned at small sample sizes. We then apply a systematic analysis of AEq values across subpopulations to identify and mitigate manifestations of racial bias in two known cases in healthcare - chest X-ray diagnosis with deep convolutional neural networks and healthcare utilization prediction with multivariate logistic regression. AEq is a novel and broadly applicable metric that can be applied to advance equity by diagnosing and remediating bias in healthcare datasets.
- North America > United States > New York (0.04)
- Asia > Middle East > Israel (0.04)
- North America > United States > Michigan > Washtenaw County > Ann Arbor (0.04)
- (4 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
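A rough way to picture the learnability idea behind AEquity is to fit a simple model on progressively larger samples from each subgroup and compare how quickly performance rises. The sketch below is an illustrative stand-in, not the paper's actual AEq definition; the data, group labels, and noise levels are synthetic.

```python
# Illustrative learnability audit: per-group learning curves at small sample sizes.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

def learning_curve(X, y, sizes, seed=0):
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=seed)
    aucs = []
    for n in sizes:
        n = min(n, len(y_tr))
        clf = LogisticRegression(max_iter=1000).fit(X_tr[:n], y_tr[:n])
        aucs.append(roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1]))
    return aucs

rng = np.random.default_rng(0)
sizes = [50, 100, 200, 400]
for group in ["group_A", "group_B"]:
    # Hypothetical per-group data; group_B is made noisier to mimic a subgroup
    # that is harder to learn at small sample sizes.
    noise = 0.5 if group == "group_A" else 1.5
    X = rng.normal(size=(2000, 10))
    y = (X[:, 0] + noise * rng.normal(size=2000) > 0).astype(int)
    print(group, [round(a, 3) for a in learning_curve(X, y, sizes)])
```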
Echoes of Biases: How Stigmatizing Language Affects AI Performance
Liu, Yizhi, Wang, Weiguang, Gao, Guodong Gordon, Agarwal, Ritu
Electronic health records (EHRs) serve as an essential data source for the envisioned artificial intelligence (AI)-driven transformation in healthcare. However, clinician biases reflected in EHR notes can lead to AI models inheriting and amplifying these biases, perpetuating health disparities. This study investigates the impact of stigmatizing language (SL) in EHR notes on mortality prediction using a Transformer-based deep learning model and explainable AI (XAI) techniques. Our findings demonstrate that SL written by clinicians adversely affects AI performance, particularly so for Black patients, highlighting SL as a source of racial disparity in AI model development. To explore an operationally efficient way to mitigate SL's impact, we investigate patterns in the generation of SL through a clinician collaboration network, identifying central clinicians as having a stronger impact on racial disparity in the AI model. We find that removing SL written by central clinicians is a more efficient bias reduction strategy than eliminating all SL in the entire corpus of data. This study provides actionable insights for responsible AI development and contributes to understanding clinician behavior and EHR note writing in healthcare.
- North America > United States > Maryland > Prince George's County > College Park (0.14)
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- Europe > Netherlands > South Holland > Leiden (0.04)
- (9 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)
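The mitigation strategy compared in the abstract above (scrubbing stigmatizing language only from notes written by central clinicians) can be sketched as follows. The SL lexicon, note structure, network edges, and the use of degree centrality are illustrative assumptions rather than details from the paper.

```python
# Sketch: remove stigmatizing language (SL) only from notes by central clinicians.
import networkx as nx

SL_TERMS = {"noncompliant", "drug-seeking", "belligerent"}  # illustrative lexicon

def strip_sl(text):
    return " ".join(w for w in text.split() if w.lower().strip(".,") not in SL_TERMS)

def mitigate(notes, edges, top_frac=0.1):
    """notes: dicts with 'clinician' and 'text'; edges: clinician co-care pairs."""
    g = nx.Graph(edges)
    centrality = nx.degree_centrality(g)
    k = max(1, int(top_frac * len(centrality)))
    central = {c for c, _ in sorted(centrality.items(), key=lambda kv: -kv[1])[:k]}
    return [
        {**n, "text": strip_sl(n["text"]) if n["clinician"] in central else n["text"]}
        for n in notes
    ]

notes = [
    {"clinician": "A", "text": "Patient was belligerent and noncompliant."},
    {"clinician": "C", "text": "Patient reports improved sleep."},
]
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")]
print(mitigate(notes, edges, top_frac=0.25))
```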
Health Care Bias Is Dangerous. But So Are 'Fairness' Algorithms
Mental and physical health are crucial contributors to living happy and fulfilled lives. How we feel impacts the work we perform, the social relationships we forge, and the care we provide for our loved ones. Because the stakes are so high, people often turn to technology to help keep our communities safe. Artificial intelligence is one of the big hopes, and many companies are investing heavily in tech to serve growing health needs across the world. And many promising examples exist: AI can be used to detect cancer, triage patients, and make treatment recommendations.
Write It Like You See It: Detectable Differences in Clinical Notes By Race Lead To Differential Model Recommendations
Adam, Hammaad, Yang, Ming Ying, Cato, Kenrick, Baldini, Ioana, Senteio, Charles, Celi, Leo Anthony, Zeng, Jiaming, Singh, Moninder, Ghassemi, Marzyeh
Clinical notes are becoming an increasingly important data source for machine learning (ML) applications in healthcare. Prior research has shown that deploying ML models can perpetuate existing biases against racial minorities, as bias can be implicitly embedded in data. In this study, we investigate the level of implicit race information available to ML models and human experts and the implications of model-detectable differences in clinical notes. Our work makes three key contributions. First, we find that models can identify patient self-reported race from clinical notes even when the notes are stripped of explicit indicators of race. Second, we determine that human experts are not able to accurately predict patient race from the same redacted clinical notes. Finally, we demonstrate the potential harm of this implicit information in a simulation study, and show that models trained on these race-redacted clinical notes can still perpetuate existing biases in clinical treatment decisions.
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > United States > New York > New York County > New York City (0.05)
- (9 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
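The first finding of the abstract above (that models can recover self-reported race from race-redacted notes) can be sketched as a simple detectability test: redact explicit race terms, then train a bag-of-words classifier and measure AUC. The redaction pattern and pipeline below are hypothetical simplifications of whatever the study actually used.

```python
# Sketch: how detectable is self-reported race from redacted clinical notes?
import re
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

EXPLICIT_RACE_TERMS = r"\b(black|white|african[- ]american|caucasian|hispanic|asian)\b"

def redact(text):
    return re.sub(EXPLICIT_RACE_TERMS, "[REDACTED]", text, flags=re.IGNORECASE)

def race_detectability(notes, labels):
    """notes: list of note strings; labels: 1 if the patient self-reported as Black."""
    redacted = [redact(t) for t in notes]
    X_tr, X_te, y_tr, y_te = train_test_split(redacted, labels, test_size=0.3, random_state=0)
    vec = TfidfVectorizer(min_df=2, ngram_range=(1, 2))
    clf = LogisticRegression(max_iter=1000).fit(vec.fit_transform(X_tr), y_tr)
    return roc_auc_score(y_te, clf.predict_proba(vec.transform(X_te))[:, 1])
```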
Assessing Phenotype Definitions for Algorithmic Fairness
Sun, Tony Y., Bhave, Shreyas, Altosaar, Jaan, Elhadad, Noémie
Disease identification is a core, routine activity in observational health research. The resulting cohorts impact downstream analyses, such as how a condition is characterized, how patient risk is defined, and what treatments are studied. It is thus critical to ensure that selected cohorts are representative of all patients, independently of their demographics or social determinants of health. While there are multiple potential sources of bias when constructing phenotype definitions which may affect their fairness, it is not standard in the field of phenotyping to consider the impact of different definitions across subgroups of patients. In this paper, we propose a set of best practices to assess the fairness of phenotype definitions. We leverage established fairness metrics commonly used in predictive models and relate them to commonly used epidemiological cohort description metrics. We describe an empirical study for Crohn's disease and type 2 diabetes, each with multiple phenotype definitions taken from the literature, across two sets of patient subgroups (gender and race). We show that the different phenotype definitions exhibit widely varying and disparate performance according to the different fairness metrics and subgroups. We hope that the proposed best practices can help in constructing fair and inclusive phenotype definitions.
- North America > United States > New York > New York County > New York City (0.04)
- North America > Canada > Ontario (0.04)
- Europe > Germany (0.04)
- Research Report > New Finding (0.68)
- Research Report > Experimental Study (0.46)
- Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
- Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (0.97)
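The assessment proposed in the abstract above can be sketched by computing per-subgroup metrics (e.g., sensitivity and positive predictive value against a gold-standard label) for each phenotype definition and comparing them. The toy patient table and placeholder definitions below are purely illustrative.

```python
# Sketch: compare phenotype definitions by subgroup sensitivity and PPV.
import pandas as pd

def subgroup_metrics(df, phenotype_col, gold_col, group_col):
    """Sensitivity and PPV of one phenotype definition within each subgroup."""
    rows = []
    for group, sub in df.groupby(group_col):
        tp = ((sub[phenotype_col] == 1) & (sub[gold_col] == 1)).sum()
        fp = ((sub[phenotype_col] == 1) & (sub[gold_col] == 0)).sum()
        fn = ((sub[phenotype_col] == 0) & (sub[gold_col] == 1)).sum()
        rows.append({
            group_col: group,
            "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
            "ppv": tp / (tp + fp) if (tp + fp) else float("nan"),
        })
    return pd.DataFrame(rows)

df = pd.DataFrame({
    "gold_t2dm": [1, 1, 0, 0, 1, 0, 1, 0],
    "def_a":     [1, 0, 0, 0, 1, 0, 1, 1],   # e.g., diagnosis-code-only definition
    "def_b":     [1, 1, 0, 1, 1, 0, 0, 0],   # e.g., codes-plus-medication definition
    "race":      ["Black", "White"] * 4,
})
for definition in ["def_a", "def_b"]:
    print(definition)
    print(subgroup_metrics(df, definition, "gold_t2dm", "race"))
```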
AI has yet to revolutionize health care
Investors have homed in on artificial intelligence as the next big thing in health care, with billions flowing into AI-enabled digital health startups in recent years. But the technology has yet to transform medicine in the way many predicted, Ben and Ruth report. "Companies come in promising the world and often don't deliver," Bob Wachter, head of the department of medicine at the University of California, San Francisco, told Future Pulse. "When I look for examples of … true AI and machine learning that's really making a difference, they're pretty few and far between." Administrators say that algorithms from third-party firms often don't work seamlessly because every health system has its own tech system, so hospitals are developing their own in-house AI.
- North America > United States > California > San Francisco County > San Francisco (0.55)
- North America > United States > New York (0.05)
- North America > United States > Nebraska (0.05)
- (2 more...)
How AI Can Remedy Racial Disparities In Healthcare
The story of American medicine is one of incredible scientific advancements, from the use of penicillin to treat syphilis and other bacterial infections to the countless biomedical breakthroughs made possible by cell-line research. Too often, however, these stories ignore an uncomfortable truth: Some of our nation's most significant medical discoveries were made possible through the mistreatment of Black patients--from the exploitation of African American farmers during the Tuskegee Syphilis Experiments to the tragic case of Henrietta Lacks, a Black patient whose cells were stolen by doctors and used for decades of cell-line research. Racism is woven into our nation's medical past but is also part of our present, as evidenced by the Covid-19 crisis. From testing to treatment, Black and Latino patients have received a lower quality and quantity of care compared with white Americans. As a country, we now have the opportunity to reverse course.